A double classification tree search algorithm for index SNP selection
BACKGROUND: In population-based studies, it is generally recognized that single nucleotide polymorphism (SNP) markers are not independent. Rather, they are carried by haplotypes, groups of SNPs that tend to be coinherited. It is thus possible to choose a much smaller number of SNPs to use as indices for identifying haplotypes or haplotype blocks in genetic association studies. We refer to these characteristic SNPs as index SNPs. In order to reduce costs and work, a minimum number of index SNPs that can distinguish all SNP and haplotype patterns should be chosen. Unfortunately, this is an NP-complete problem, requiring brute force algorithms that are not feasible for large data sets. RESULTS: We have developed a double classification tree search algorithm to generate index SNPs that can distinguish all SNP and haplotype patterns. This algorithm runs very rapidly and generates very good, though not necessarily minimum, sets of index SNPs, as is to be expected for such NP-complete problems. CONCLUSIONS: A new algorithm for index SNP selection has been developed. A webserver for index SNP selection is available a
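Although the abstract does not detail the double classification tree search, the underlying task — choosing a small set of SNPs that distinguishes all haplotype patterns — can be viewed as set cover over haplotype pairs. A minimal greedy sketch of that formulation (not the paper's algorithm; the data layout and names are illustrative assumptions):

```python
from itertools import combinations

def greedy_index_snps(haplotypes):
    """Greedily pick index SNPs until every pair of haplotypes is
    distinguished by at least one chosen SNP (set-cover heuristic).
    haplotypes: list of equal-length allele lists, one per haplotype."""
    n, m = len(haplotypes), len(haplotypes[0])
    # haplotype pairs not yet distinguished by any chosen SNP
    uncovered = set(combinations(range(n), 2))
    chosen = []
    while uncovered:
        # SNP that separates the most still-uncovered pairs
        best = max(range(m),
                   key=lambda s: sum(haplotypes[i][s] != haplotypes[j][s]
                                     for i, j in uncovered))
        newly = {(i, j) for i, j in uncovered
                 if haplotypes[i][best] != haplotypes[j][best]}
        if not newly:
            break  # remaining pairs are identical at every SNP
        chosen.append(best)
        uncovered -= newly
    return chosen
```

Like the paper's method, a greedy heuristic yields good but not necessarily minimum index sets, which is the expected trade-off for an NP-complete problem.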
Verifying Data Constraint Equivalence in FinTech Systems
Data constraints are widely used in FinTech systems for monitoring data consistency and diagnosing anomalous data manipulations. However, many equivalent data constraints are created redundantly during the development cycle, slowing down the FinTech systems and causing unnecessary alerts. We present EqDAC, an efficient decision procedure for determining data constraint equivalence. We first propose a symbolic representation for semantic encoding and then introduce two lightweight analyses to refute and prove equivalence, respectively, both of which are proven to run in polynomial time. We evaluate EqDAC upon 30,801 data constraints in a FinTech system. It is shown that EqDAC detects 11,538 equivalent data constraints in three hours. It also supports efficient equivalence searching with an average time cost of 1.22 seconds, enabling the system to check new data constraints upon submission.
Comment: 14 pages, 11 figures, accepted by ICSE 202
Gene functional similarity search tool (GFSST)
BACKGROUND: With the completion of the genome sequences of human, mouse, and other species and the advent of high throughput functional genomic research technologies such as biomicroarray chips, more and more genes and their products have been discovered and their functions have begun to be understood. Increasing amounts of data about genes, gene products and their functions have been stored in databases. To facilitate selection of candidate genes for gene-disease research, genetic association studies, biomarker and drug target selection, and animal models of human diseases, it is essential to have search engines that can retrieve genes by their functions from proteome databases. In recent years, the development of Gene Ontology (GO) has established structured, controlled vocabularies describing gene functions, which makes it possible to develop novel tools to search genes by functional similarity. RESULTS: By using a statistical model to measure the functional similarity of genes based on the Gene Ontology directed acyclic graph, we developed a novel Gene Functional Similarity Search Tool (GFSST) to identify genes with related functions from annotated proteome databases. This search engine lets users design their search targets by gene functions. CONCLUSION: An implementation of GFSST which works on the UniProt (Universal Protein Resource) for the human and mouse proteomes is available at GFSST Web Server. GFSST provides functions not only for similar gene retrieval but also for gene search by one or more GO terms. This represents a powerful new approach for selecting similar genes and gene products from proteome databases according to their functions
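As a toy illustration of measuring functional similarity over the Gene Ontology directed acyclic graph, the sketch below scores two GO terms by the overlap of their ancestor sets. This is a simple stand-in, not GFSST's actual statistical model; the term names and parent map are hypothetical:

```python
def ancestors(term, parents):
    """All ancestors of a GO term in the DAG, including the term itself.
    parents maps a term to the list of its direct parent terms."""
    seen, stack = {term}, [term]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def term_similarity(t1, t2, parents):
    """Jaccard overlap of ancestor sets: shared ancestry in the GO DAG
    serves as a crude proxy for functional similarity."""
    a1, a2 = ancestors(t1, parents), ancestors(t2, parents)
    return len(a1 & a2) / len(a1 | a2)
```

Two sibling terms under a common parent score higher than unrelated terms, which is the intuition behind DAG-based functional similarity search.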
Synthesizing Conjunctive Queries for Code Search
This paper presents Squid, a new conjunctive query synthesis algorithm for searching code with target patterns. Given positive and negative examples along with a natural language description, Squid analyzes the relations derived from the examples by a Datalog-based program analyzer and synthesizes a conjunctive query expressing the search intent. The synthesized query can be further used to search for desired grammatical constructs in the editor. To achieve high efficiency, we prune the huge search space by removing unnecessary relations and enumerating query candidates via refinement. We also introduce two quantitative metrics for query prioritization to select the queries from multiple candidates, yielding desired queries for code search. We have evaluated Squid on over thirty code search tasks. It is shown that Squid successfully synthesizes the conjunctive queries for all the tasks, taking only 2.56 seconds on average
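A conjunctive query of the kind Squid synthesizes can be answered by joining its atoms over relations extracted from code. A minimal evaluator sketch (the relation names, query shape, and facts below are illustrative assumptions, not Squid's actual representation):

```python
def eval_conjunctive_query(atoms, facts):
    """Evaluate a conjunctive query by nested joins.
    atoms: list of (relation_name, variable_tuple) pairs.
    facts: dict mapping relation_name -> set of value tuples.
    Returns all variable bindings satisfying every atom."""
    bindings = [{}]
    for rel, vars_ in atoms:
        extended = []
        for b in bindings:
            for tup in facts[rel]:
                nb = dict(b)
                consistent = True
                for var, val in zip(vars_, tup):
                    if nb.get(var, val) != val:
                        consistent = False  # clashes with earlier binding
                        break
                    nb[var] = val
                if consistent:
                    extended.append(nb)
        bindings = extended
    return bindings
```

For example, the two-atom query calls(X, Y) ∧ calls(Y, Z) finds two-step call chains in a "calls" relation.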
Endoluminal Motion Recognition of a Magnetically-Guided Capsule Endoscope Based on Capsule-Tissue Interaction Force
A magnetically-guided capsule endoscope, embedding flexible force sensors, is designed to measure the capsule-tissue interaction force. The flexible force sensor is composed of eight force-sensitive elements surrounding the internal permanent magnet (IPM). Controlling the interaction force acting on the intestinal wall can reduce the patient's discomfort and maintain the magnetic coupling between the external permanent magnet (EPM) and the IPM during capsule navigation; the flexible force sensor makes this control possible. In particular, by analyzing the signals of the force-sensitive elements, we propose a method to recognize the motion status of the magnetic capsule, and provide corresponding formulas to evaluate whether the magnetic capsule follows the motion of the external driving magnet. Accuracy of the motion recognition in ex vivo tests reached 94% when the EPM was translated along the longitudinal axis. In addition, a method is proposed to realign the EPM and the IPM before the loss of their magnetic coupling. Its translational error, rotational error, and runtime are 7.04 ± 0.71 mm, 3.13 ± 0.47°, and 11.4 ± 0.39 s, respectively. Finally, a control strategy is proposed to prevent the magnetic capsule endoscope from losing control during magnetically-guided capsule colonoscopy
Photometric Stereo-Based Depth Map Reconstruction for Monocular Capsule Endoscopy
The capsule endoscopy robot can only use monocular vision due to the dimensional limit. To improve the depth perception of the monocular capsule endoscopy robot, this paper proposes a photometric stereo-based depth map reconstruction method. First, based on the characteristics of the capsule endoscopy robot system, a photometric stereo framework is established. Then, by combining the specular property and Lambertian property of the object surface, the depth of the specular highlight point is estimated, and the depth map of the whole object surface is reconstructed by a forward upwind scheme. To evaluate the precision of the depth estimation of the specular highlight region and the depth map reconstruction of the object surface, simulations and experiments are implemented with synthetic images and pig colon tissue, respectively. The results of the simulations and experiments show that the proposed method provides good precision for depth map reconstruction in monocular capsule endoscopy
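For context, the classical Lambertian photometric-stereo step — recovering a per-pixel surface normal and albedo from intensities under known light directions — can be sketched as below. The paper's method additionally exploits the specular highlight to anchor depth and propagates it with a forward upwind scheme, which this sketch omits:

```python
import numpy as np

def photometric_stereo_normal(intensities, light_dirs):
    """Recover the surface normal and albedo at one pixel from k >= 3
    intensities under known light directions, assuming a Lambertian
    surface: I = albedo * (n . l). Solves the linear system in the
    least-squares sense."""
    L = np.asarray(light_dirs, dtype=float)   # k x 3 light directions
    I = np.asarray(intensities, dtype=float)  # k observed intensities
    g, *_ = np.linalg.lstsq(L, I, rcond=None) # g = albedo * n
    albedo = np.linalg.norm(g)
    return g / albedo, albedo
```

With a single light source, as on a monocular capsule, this system is underdetermined, which is why the paper combines specular and Lambertian cues instead of plain photometric stereo.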
Optimal Step Length EM Algorithm (OSLEM) for the estimation of haplotype frequency and its application in lipoprotein lipase genotyping
Background: Haplotype-based linkage disequilibrium (LD) mapping has become a powerful and cost-effective method for performing genetic association studies, particularly in the search for genetic markers in linkage disequilibrium with complex disease loci. Various methods (e.g. Monte Carlo (Gibbs sampling), EM (expectation maximization), and Clark's method) have been used to estimate haplotype frequencies from routine genotyping data. Results: These algorithms can be very slow for large numbers of SNPs. In order to speed them up, we have developed a new algorithm using numerical analysis techniques, the so-called optimal step length EM (OSLEM), which accelerates the calculation. By approximately optimizing the step length of the EM algorithm, OSLEM can run at about twice the speed of standard EM. This algorithm has been used for lipoprotein lipase (LPL) genotyping analysis. Conclusions: This new optimal step length EM (OSLEM) algorithm can accelerate haplotype frequency estimation for genotyping data without pedigree information. An OSLEM on-line server is available, as well as a free downloadable version
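The step-length idea can be sketched generically: take the direction of one ordinary EM update and extrapolate along it by the step length that most improves the log-likelihood. This is an illustrative acceleration skeleton, not the paper's exact OSLEM update; the `em_update` and `loglik` callbacks and the candidate step set are assumptions:

```python
def accelerated_em_step(theta, em_update, loglik, steps=(1.0, 1.5, 2.0)):
    """One step-length-accelerated EM iteration (sketch).
    Standard EM corresponds to step length 1.0; larger steps
    extrapolate along the EM direction d = EM(theta) - theta,
    and we keep whichever candidate maximizes the log-likelihood."""
    base = em_update(theta)
    d = [b - t for b, t in zip(base, theta)]
    best = max(steps,
               key=lambda a: loglik([t + a * di for t, di in zip(theta, d)]))
    return [t + best * di for t, di in zip(theta, d)]
```

When the likelihood surface is locally well-behaved, a step length near 2 roughly halves the number of EM iterations, which matches the abstract's reported two-fold speedup.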